ikrami2000@hotmail.com

ikrami.samy@peinfosys.com

 

 

IT INFRASTRUCTURE LIBRARY (ITIL)

 

 

 

1 Executive Summary

Today's businesses rely on the many business processes and services supported by IT infrastructure. It is well known that IT failures often lead to significant adverse impact on IT services and hence the business. With this in mind, companies are seeking to align IT with business objectives to ensure that the IT infrastructure consistently supports the business.

To help accomplish this goal, ITIL is becoming an increasingly popular standard as it defines a set of best practices for IT Service Management, thereby ensuring that business requirements are cost-effectively met.

ITIL benefits both the customer/user and the IT organization. As IT services become more clearly focused on business objectives and service agreements, customers will see an improvement in their business relationships. Further, this alignment enables IT to better describe services in language the customers can understand, and improves communication through defined points of contact. ITIL empowers the IT organization to improve efficiency through standardized processes, such as easily implementing changes in the infrastructure to continuously improve IT services. The role of IT becomes more clearly defined as it is integrated with business objectives and critical decisions are more easily made. Finally, the quality and cost of the services are better managed, providing the required service levels at acceptable costs.

The Concord SPECTRUM Business Unit is dedicated to helping businesses accomplish their ITIL efforts. We provide a full suite of products to enable effective IT Service Management. Further, where ITIL processes expand beyond activities that can be achieved by technology alone, our business processes help organizations implement these ITIL best practices. This paper gives a high-level overview of the ITIL processes, and shows how SPECTRUM maps to these processes to support your ITIL efforts.

 

 

2 Introduction to ITIL

ITIL is the IT Infrastructure Library which was originally defined by the Central Computer and Telecommunications Agency of the UK government (CCTA) in the 1980s. The CCTA has become the Office of Government Commerce (OGC) and now owns ITIL. The Information Technology Service Management Forum (itSMF) is an international, independent user group that has become a major influence on the best practices of IT Service Management and has embraced ITIL to do so. itSMF also continues to contribute to ITIL.

 

 

2.1.1 Overview of the ITIL Processes

ITIL provides detailed process definitions for many IT functions that can be adapted to any IT organization. Actually, most processes defined in detail by ITIL are already partially implemented in most IT organizations. The main focus of ITIL processes is on IT Service Management. ITIL consists of a set of 11 Processes and 1 Function all working together to deliver effective IT Service Management. The sections defined by ITIL are:

·          Service Desk. This is the function described in ITIL and is the initial point of contact between the IT organization and users. It is responsible for many ITIL Processes.

·          Incident Management is the process focusing on solving incidents and restoring services quickly.

·          Problem Management is the process focusing on solving root cause problems to prevent future incidents.

·          Configuration Management is the process that keeps all required information about services, service components, relationships, and other items accurate and up to date.

·          Change Management is the process that controls the implementation of changes in the infrastructure.

·          Release Management is the process that controls the rollout of new releases in the infrastructure.

·          Service Level Management is the process that defines and implements clear agreements for service delivery between the customers and the IT organization.

·          Financial Management for IT Services is the process that ensures the sensible management, maintenance, and operation of IT Services from a financial standpoint.

·          Capacity Management is the process that optimally manages capacity to meet the service requirements at an acceptable cost.

·          Availability Management is the process that ensures the availability of IT resources and hence the availability of IT Services to meet agreed-upon service levels.

·          IT Service Continuity Management is the process that focuses on defining and maintaining appropriate disaster recovery plans for IT Services.

·          Security Management is the process that ensures the proper access to services as defined by the service agreements and prevents unauthorized use.

 

It is important to note that all these processes are heavily interrelated and dependent upon one another. There are typically blurry lines where one process ends and another starts. A single example of this interdependency is that Incident Management, Configuration Management, Problem Management, Release Management, Service Level Management, Availability Management, Capacity Management, and IT Service Continuity Management are all directly linked to Change Management.

 

2.1.2 Benefits of ITIL

There are many benefits associated with the successful implementation of ITIL. ITIL best practices allow IT organizations to deliver the optimal service levels to their customers based on balancing the performance and cost of the services with the business requirements. Relationships between the provider (internal or external) and customer are also improved through added customer focus and through SLAs allowing both parties to have a mutual understanding of the requirements and the delivery. Additionally, implementing ITIL best practices makes IT Service Management more efficient, again by focusing on delivering the required business services at the agreed upon service levels and by having well defined processes for performing the required management tasks. Using the well defined processes and best practices helps eliminate problems while increasing service levels.

 

 

2.1.3 ITIL Adoption

ITIL is most prevalent in Europe, as this is where it was initially developed. However, ITIL implementation is quickly gaining momentum worldwide as businesses move toward IT Service Management and look for industry best practices to help them deliver services effectively. ITIL implementation is not limited to any vertical industry; it is being embraced by service providers, enterprises, and many government and military organizations. ITIL is particularly prevalent in businesses that are signing up to Service Level Agreements (SLAs), whether these are for internal service delivery within an organization or between customers and external service providers.

We are dedicated to helping with the adoption of ITIL best practices through an extensive product suite enabling effective service management and through business practices that support a customer’s ITIL processes.

 

 

3 Concord’s Support for ITIL in IT Service Management

This section outlines each process and how the Concord SPECTRUM Business Unit helps support it in an IT organization.

3.1 Service Desk

The service desk is actually a function, not a process. It acts as the single point of contact for all users and can perform multiple ITIL processes. The goal is to support the services that are to be delivered to the users and ensure timely response to issues. As the first line of support, the service desk reduces the workload on the second, third, and higher levels of support by taking care of the simple calls, and provides better response to users since they have only one contact for reporting and following up on issues. The Service Desk primarily handles the Incident Management process; however, it also plays a role in Release Management and Change Management.

Responsibilities of the service desk include:

·          Responding to users’ calls, including logging and tracking all calls. The two types of calls are Incidents and Standard requests. Incidents include errors, which are actual faults in the IT Services, and Service Requests, which are requests for information regarding the service (such as how to perform a task), requests to reset passwords, requests for replacement of non-tracked equipment such as power cords, etc. Standard requests include changes such as PC upgrades, network connection changes, account setup, etc. Responding to users’ calls also includes resolving any incidents that have standard solutions and sharing information with other ITIL processes where appropriate.

·          Providing proactive information such as current or expected errors and information regarding services and SLAs to customers and users.

·          Communicating with suppliers for replacement of hardware and software.

·          Performing operational tasks such as backups, account creation, password creation and resetting, new network connections, etc.

·          Monitoring the infrastructure for faults and the root causes of faults and automatically notifying Incident Management.

 

The SPECTRUM product suite supports the service desk in many ways. SPECTRUM monitors the IT infrastructure and uses root cause analysis to quickly pinpoint faults. Using the OneClick Console and Service Dashboard, service desk personnel have the information at their fingertips not only to answer calls quickly with the proper information, but also to act proactively to communicate problems and help fix them before users call. This is expedited by SPECTRUM’s probable cause and recommended actions files, which help service desk personnel understand solutions to the problems. The Remedy Gateway and other trouble ticket integrations automatically create tickets for issues found by SPECTRUM and also allow bi-directional functionality to automatically clear alarms in SPECTRUM as tickets are closed or automatically update tickets as alarm status changes. This eliminates much of the manual effort required of the service desk and streamlines the operation. Further, SPECTRUM Alarm Notification Manager (SANM) and Attention provide automated alarm escalation and notification so the appropriate people are aware of the issues and can address them efficiently, again reducing the manual workload of the service desk.

 

3.2 Incident Management

As defined by ITIL, “an incident is any event which is not part of the standard operation of a service and causes, or may cause, an interruption to, or a reduction in the quality of that service.” This differs from a problem in that an “incident” has a known workaround or fix while a “problem” is the underlying root cause of an incident that when fixed, will prevent future incidents. Incidents include service requests to do things such as reset passwords or provide documentation in addition to errors in the infrastructure.

The goal of Incident Management is to restore services to guaranteed levels quickly and efficiently to limit the impact of incidents on business processes. Further, Incident Management maintains logs of incidents and their solutions to make it easier to solve similar incidents in the future and make this information available to other processes.

Responsibilities of Incident Management include:

·          Accepting and recording incidents whether they are found by a user, management system, or IT personnel. Each incident is recorded with a tracking number, basic yet detailed information about the incident (such as time, affected users, affected service, etc.), and supplemental information from other sources. For high-impact incidents, Incident Management also alerts other impacted users.

·          Classifying incidents with a category (type of incident), priority based on urgency and impact, related services, time to repair requirements of the SLA, and incident identification.

·          Matching new and previous incidents, looking for a workaround used for a similar incident in the past.

·          Investigating and diagnosing incidents to find a solution.

·          Resolving and recovering incidents, including submitting Requests for Change (RFCs) to the Change Management process.

·          Monitoring and tracking incidents to keep users and other ITIL processes up to date with any changes.

·          Closing incidents by verifying the solution with the person who reported the incident.
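
The classification responsibility above assigns a priority from urgency and impact. The following is a minimal sketch of one way to do that, assuming a three-level scale for each input and a simple additive mapping. The level names, the mapping, and the `Incident` fields are illustrative assumptions; they are not values mandated by ITIL or taken from any SPECTRUM product.

```python
from dataclasses import dataclass

# Hypothetical three-level scales; real organizations define their own.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact: str, urgency: str) -> int:
    """Return a priority from 1 (most critical) to 5 (least critical).

    Assumed mapping: add the two levels and subtract 1, so
    high/high -> 1 and low/low -> 5.
    """
    return LEVELS[impact] + LEVELS[urgency] - 1


@dataclass
class Incident:
    tracking_number: str
    category: str          # type of incident, e.g. "network"
    affected_service: str
    impact: str            # "high", "medium", or "low"
    urgency: str           # "high", "medium", or "low"

    @property
    def priority(self) -> int:
        return priority(self.impact, self.urgency)


incident = Incident("INC-0001", "network", "email", impact="high", urgency="medium")
print(incident.priority)  # 2
```

In practice the mapping is usually a lookup table agreed with the business, often tied to the time-to-repair targets in the SLA, rather than a formula.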

 

In support of this process, the SPECTRUM integrated application suite detects incidents automatically and finds the root cause often before users report it to the Service Desk. OneClick Console and the Service Dashboard put this information at the fingertips of the Incident Management staff enabling a quick turnaround, thus limiting the performance and availability impact on the service. The information provided includes the details required to classify the incident including the priority based on the impact to business services and customers.

To aid in accepting and recording incidents, trouble ticket software integrations simplify the ticketing process by automatically creating tickets based on incidents found by SPECTRUM Service Manager. The integration with Remedy also allows automatic ticket closing when incidents have been resolved.

SPECTRUM Alarm Notification Manager and an integration with Attention software also help automate escalations, ensuring the appropriate support personnel are notified of incidents in a timely manner and allowing faster resolution. Once notified, the support groups can take advantage of SPECTRUM Report Manager and the OneClick Console to easily study current incidents, match them to previous incidents, and find solutions.

 

 

3.3 Problem Management

As defined by ITIL, “a problem is the unknown underlying cause of one or more Incidents. It will become a Known Error when the root causes are known and a temporary workaround or a permanent alternative has been identified.”

The goal of Problem Management is to find the root causes of errors and potential errors and make sure they are fixed to limit incidents in the future. This goes a step further than Incident Management, which resolves only the single incident, whether or not that includes fixing the root cause.

 

 

Limiting incidents not only allows the business processes to run more smoothly, increasing customer satisfaction and profit opportunity, but also reduces the workload on the IT organization.

Responsibilities of Problem Management include:

·          Problem control to identify problems and diagnose them to find the root cause. This step includes recording problem details including the related incidents.

·          Error control to monitor and fix known errors where appropriate based on SLAs, costs to fix, and impact.

·          Proactive Problem Management focusing on quality of the IT Infrastructure using trend analysis to find potential problems before incidents occur.

·          Providing information regarding known errors and workarounds to other ITIL processes.

 

 

 

Supporting this process, the SPECTRUM suite of products identifies and isolates the root causes of the problems in the infrastructure, allowing them to become known errors. The SPECTRUM core analysis engine leverages patented fault isolation and root cause analysis capabilities to determine problems on an infrastructure component basis. SPECTRUM’s Service Manager application enables root cause correlation on a service and customer basis. Further, Service Manager prioritizes issues based on impact and urgency allowing the root cause issues to be addressed effectively and efficiently. In cases where human intervention is required for troubleshooting and finding the root cause, the SPECTRUM suite provides many troubleshooting and analysis tools to aid this process such as those in OneClick Console and statistical reports from Report Gateway.

SPECTRUM products also enable Proactive Problem Management through trend analysis. Report Gateway identifies abnormal patterns and trends in any statistical data collected by SPECTRUM, proactively identifying things that are possible problems for the future. Further, Service Performance Manager (SPM) proactively monitors traffic to find performance trends that may become problems in the future. Dynamic thresholding even enables alarms to be generated as trends vary too far from normal.
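
To illustrate the idea behind dynamic thresholding, the sketch below flags a sample when it deviates from a rolling baseline by more than a chosen number of standard deviations. This is a generic technique shown under assumed parameters (`window`, `k`) and made-up sample data; it is not a description of SPECTRUM's actual algorithm.

```python
from collections import deque
from statistics import mean, pstdev


def dynamic_threshold_alarms(samples, window=5, k=3.0):
    """Yield (index, value) for samples that deviate from the rolling baseline.

    The baseline is the mean of the previous `window` samples; a sample
    raises an alarm when it differs from that mean by more than `k`
    population standard deviations. `window` and `k` are assumed tuning knobs.
    """
    history = deque(maxlen=window)
    for i, value in enumerate(samples):
        if len(history) == window:
            baseline = mean(history)
            spread = pstdev(history)
            if spread > 0 and abs(value - baseline) > k * spread:
                yield i, value
        history.append(value)


# Hypothetical utilization samples (percent); the spike at index 6 stands out.
utilization = [40, 42, 41, 39, 40, 41, 90, 42]
print(list(dynamic_threshold_alarms(utilization)))  # [(6, 90)]
```

Because the threshold follows recent behavior instead of a fixed number, the same rule can be applied to resources with very different normal levels.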

 

 

3.4 Configuration Management

Information is crucial to making decisions. Without detailed, up-to-date, and accurate information about the IT infrastructure, poor decisions are made. The goal of the Configuration Management process is to ensure all information regarding the IT Infrastructure, its components, and the services it supports is detailed, up-to-date, and accurate. The information includes details about each specific component (Configuration Item, or CI), as well as the relationships between CIs, and is stored in the Configuration Management Database (CMDB).

Responsibilities of Configuration Management include:

·          Planning to define the strategy, process, and objectives of Configuration Management.

·          Identification where the data models are defined to record the appropriate information on each CI and the relationships with other CIs. It also includes defining the methods of adding new CIs and updating existing CIs to ensure the CMDB is always up to date and has the appropriate information.

·          Control to ensure only authorized changes and additions are made to the CMDB. This requires appropriate approval and documentation such as an approved Request for Change in order to make any changes to the CIs.

·          Status Monitoring to indicate the status lifecycle of each CI from under development, test, stock, live use, to phased out.

·          Verification to audit the infrastructure and verify the CI’s existence and actual information to ensure the CMDB is correct.

·          Reporting to provide information on the CIs and their relationships to the other ITIL processes.
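
The verification activity above amounts to comparing what the CMDB records against what is actually found in the infrastructure. A minimal sketch follows, assuming both sides can be reduced to a mapping from a CI identifier to its attributes. The function name, CI names, and attributes are hypothetical and for illustration only.

```python
def audit_cmdb(cmdb: dict, discovered: dict) -> dict:
    """Compare CMDB records with discovered inventory.

    Both arguments map a CI identifier to a dict of attributes.
    Returns CIs missing from the CMDB, CIs no longer found in the
    infrastructure, and CIs whose recorded attributes differ.
    """
    cmdb_ids, found_ids = set(cmdb), set(discovered)
    return {
        "unrecorded": sorted(found_ids - cmdb_ids),
        "not_found": sorted(cmdb_ids - found_ids),
        "mismatched": sorted(
            ci for ci in cmdb_ids & found_ids if cmdb[ci] != discovered[ci]
        ),
    }


# Hypothetical data for illustration only.
cmdb = {
    "router-1": {"firmware": "12.1", "status": "live"},
    "switch-7": {"firmware": "3.4", "status": "live"},
}
discovered = {
    "router-1": {"firmware": "12.2", "status": "live"},
    "server-9": {"firmware": "1.0", "status": "live"},
}
print(audit_cmdb(cmdb, discovered))
# {'unrecorded': ['server-9'], 'not_found': ['switch-7'], 'mismatched': ['router-1']}
```

Each discrepancy would normally be resolved through the Control activity, for example by tracing it to an approved Request for Change, and not by editing the CMDB directly.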

 

 

The SPECTRUM product suite helps with Configuration Management by providing information on what should be stored in the CMDB and on the relationships between components, customers, and owners, and by auditing the IT Infrastructure to ensure the CMDB is accurate and up-to-date.

SPECTRUM Report Manager identifies information required for CIs based on SPECTRUM’s in-depth knowledge of the infrastructure. This information includes things such as:

·         What IT Components are being used with asset usage reports

·          What needs to be upgraded with asset reports showing current versions of firmware and hardware

·          The status of different components and their availability to see which components are contributing to problems

·         Inventory reports to help verify the CMDB

In addition to Report Manager, Service Manager manages the relationships between the IT components allowing them to be effectively managed. It provides information such as what will be affected by a change and what needs to be considered. This includes determining maintenance times based on the effect maintenance of a CI will have on the business services. It also relates the CIs to services and customers thereby relating them directly to the business. This information may be used by financial management.

Finally, SPECTRUM Configuration Manager helps configuration management by verifying configurations of components to be sure they have not been changed without updating the CMDB, helping to keep the CMDB in sync with the actual environment.

 

3.5 Change Management

“Not every change is an improvement, but every improvement is a change” - Motto of Change Management from ITIL. A change is anything from a minor installation to a major reconfiguration and service rollout. While some changes can be planned well in advance, others are required quickly to fix an incident or a major problem. Planned or unplanned, without a good change management process, changes are often the cause of many incidents, thereby consuming unnecessary resources for troubleshooting and correction. The goal of change management is to manage the change process so changes can be done quickly and with little impact on the services thereby limiting the incidents resulting from the change.

The responsibilities of Change Management include:

·         Recording all Requests for Change (RFCs). An RFC is submitted for every non-standard change. A well known and clearly defined modification, such as a password reset, can be handled with a Service Request to the Service Desk.

·          Accepting RFCs to ensure all the required information to make decisions is included. If the required information is not included, the RFC is rejected and must be resubmitted.

·          Classification of the RFCs after they are accepted, where a priority and a category (minor, substantial, or major impact) are assigned.

·          Planning of changes to ensure they are implemented in the required timeframe with the proper approvals and with an understanding of the true impact and required resources.

·          Coordination of the change with the appropriate IT personnel to make the change. Making the change includes building, testing, and implementing the change.

·          Evaluation of the change after it is implemented to determine if it met the required objective, if users are satisfied, if there were side effects of the change, and if costs and resources were within budget.

·          Implementing urgent changes where there is not time to follow the full process. In this case, the evaluation and other steps must be completed following the change in order to ensure traceability of the change.

 

 

The SPECTRUM product suite contributes to the evaluation of changes. In particular, SPM and SPECTRUM Report Gateway (SRG), as well as core SPECTRUM, help IT understand the impact of any changes on the rest of the infrastructure. This is evaluated by comparing trends before and after the change. Further, they help identify whether the change is successful, particularly in cases where a change is made to correct a performance issue.

In addition to products supporting Change Management, we provide different product release mechanisms to fit within the change management process. Often, changes are required to fix a small problem and need to be done quickly. In this case, we provide hot-fixes. Service packs and full releases are also provided to bundle many software changes so they can be made at once in a controlled environment.

 

 

3.6 Release Management

The goal of Release Management is to ensure that all technical and non-technical aspects of a release are covered when planning for it. The process is intended to ensure a reliable production environment that delivers high-quality services. Though it is related to Change Management, it is concerned with the implementation of the changes through releases.

 

Responsibilities of Release Management include:

·          Release policy and planning, including identifying all the dependencies, coordinating the release with schedules and communication plans, determining the testing plans, and other details related to the release.

·          Design, building, and configuration of the release. This includes testing the release in a lab environment and creating full documentation on how to recreate the configuration to help ensure a successful implementation in production. This activity also includes creating a back-out plan in case there are problems during implementation that require the release to be aborted and the previous environment to be restored.

·          Testing and release acceptance includes functional testing by a subset of the users and operational testing by IT.

·           Implementation planning augments the release planning by including all the details for task schedules and required resources, all items that must be changed, communicating to all affected people, etc. Releases are either done in one full release or in stages.

·          Communication, preparation and training for all parties who communicate with customers and users to ensure they can communicate the right information. Further, changes to all SLAs, OLAs, and Underpinning Contracts must be communicated as early as possible prior to the release.

·           Release distribution and installation includes the actual rollout. Rollout includes purchasing the appropriate equipment, implementing it, ensuring the Configuration Management DB is updated, and verifying a successful implementation.

 

Although we do not provide specific “release management” software, the SPECTRUM products and processes enable release management with regard to the management system. Lab licenses are available at a reasonable cost to enable our customers to acquire a duplicate of their production environment for their lab, enabling complete SPECTRUM release testing for upgrading to new versions, rolling out new modules, or making changes to configurations. We also provide bundled service packs that are fully tested for migration. Further, add-on products are included within service pack releases allowing customers to roll out new products without a major release of SPECTRUM.

We also provide a best practices guide for installation and upgrades of SPECTRUM products to help with release planning and implementation. Finally, we have focused development on simplifying installation and administration: maintaining customizations during major and service pack upgrades, allowing distributed installation, and producing the OneClick architecture with a single point of administration.

 

 

3.7 Service Level Management

The Service Level Management process is more involved than many IT people think, going beyond simply defining a technical service and monitoring that service. It includes everything from negotiating with the customers and defining the service with regard to the customer requirements to managing and measuring the service and improving quality over time, all within an acceptable cost structure. Service Level Management creates a relationship between the IT Service Organization (provider) and the customer, allowing them to agree upon the level of service that will be delivered in order to meet the business and financial needs. It also allows a mapping between the technical requirements to deliver the service (for the IT organization) and common business language describing the service and the guarantees (for the customer).

Responsibilities of Service Level Management include:

·          Developing a Service Catalog to provide details of services offered, written in language that can be easily understood by customers.

·          Identifying customer needs, which includes understanding their business processes and requirements to enable creation of the proper services and SLAs.

·          Defining services to meet the needs identified with the customer. The services are documented in the Service Level Requirements, in a non-technical language that can be understood by the customer. Specification Sheets including both the detailed customer requirements and how this will impact the IT Organization are also developed.

·          Creating contracts required for each service:

o        Service Level Agreements between the provider and customer.

o        Operational Level Agreements between internal organizations required to deliver the service levels.

o        Underpinning Contracts between the provider and any external suppliers required to deliver the service at the guaranteed levels.

·          Implementing, monitoring, and reporting on the services on a regular basis.

·          Creating Service Improvement Plans (SIPs) to continuously improve IT Services helping both the provider and the customer remain competitive.

 

 

The SPECTRUM suite of products plays a role in many aspects of the Service Level Management process. The product suite is focused on the technical side of Service Level Management after the SLAs have been negotiated and defined between the provider and the customer. Once the SLAs are defined and the technical requirements have been worked out, SPECTRUM, SPM, iAgent, and Service Manager are configured to monitor and manage the services and agreements (SLAs, OLAs, and UCs). Service Manager's templates simplify this process by enabling providers to align the service templates with their service catalog and the SLA templates with their typical SLAs, allowing quick implementation of service and SLA monitoring.

Service monitoring and management is accomplished through SPECTRUM's Service Correlation and Fault Management, which let providers quickly find the root cause of incidents and problems so that Incident Management can resolve incidents quickly and maintain agreed-upon service levels. SPM and other SPECTRUM products integrate performance, system, application, and other traditional management silos to Services and SLAs.

Part of managing and monitoring services includes presenting the results to the service managers, service desk, customers, and others. The Service Dashboard provides real-time status of services and SLAs. Service Manager and SPECTRUM Report Manager provide the historical reports on the services and SLAs. These reports are required for the Service Quality Plan to show how the service is meeting the SLA on a regular basis. All reports may be scheduled on a regular basis and/or run ad hoc. Further, the service and SLA reports provide valuable input to the SIP by indicating how the services have been performing and any details regarding incidents down to the root cause. This allows the service manager to easily identify areas for improvement and create SIPs.
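
As a worked example of the kind of figure an SLA report carries, availability over a reporting period can be computed from recorded outage time and compared with the SLA target. The 30-day period, the outage durations, and the 99.9% target below are assumed values for illustration, and the function name is hypothetical.

```python
def availability_percent(period_minutes: float, outage_minutes: list) -> float:
    """Availability = (period - total downtime) / period, as a percentage."""
    downtime = sum(outage_minutes)
    return 100.0 * (period_minutes - downtime) / period_minutes


period = 30 * 24 * 60          # a 30-day reporting period in minutes (43200)
outages = [12, 30]             # hypothetical outage durations in minutes
target = 99.9                  # hypothetical SLA target in percent

achieved = availability_percent(period, outages)
status = "met" if achieved >= target else "missed"
print(f"{achieved:.3f}% (target {target}%): {status}")
# 99.903% (target 99.9%): met
```

With these numbers, 42 minutes of downtime in 43,200 minutes leaves the service just inside a 99.9% target, which allows about 43 minutes of downtime per 30 days.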

 

Finally, detailed reports are provided to help with the service reviews with both customers and internal IT to determine ways to improve the service by showing not only service availability and performance, but also the root cause of any incidents. This information can be used by Problem Management to recommend changes to the infrastructure to better meet the SLAs and provide better services at a greater profit.

 

3.8 Financial Management for IT Services

 

The goal of Financial Management for IT Services is to enable IT to provide services cost-effectively through the financial management of IT resources required to deliver the service. This includes budgeting, accounting, and charging. Through this process both IT organizations and customers alike become more aware of the true costs of delivering the services, allowing them to make better business decisions.

 

Budgeting is based on planning for customers' needs and determining the cost associated with delivery of services to the customers. Accounting consists of tracking the expenditures of IT. It specifically tracks the cost by customer, service, etc. Without Budgeting and Accounting, it becomes difficult to make business decisions that balance cost of services with service quality.

 

Charging includes all things required to bill the customer from the objectives to the actual calculation methods. The advantage of charging, whether to internal customers or to external customers, based on the quality of the service is that it improves the provider/customer relationship. It opens up negotiations regarding what will be provided for what price and allows customers and providers to make informed business decisions based on cost vs. quality.

 

Although SPECTRUM products don't provide budgeting, accounting, or charging tools, there are many pieces of SPECTRUM such as integrations, SLAs, etc. that indirectly support financial management through other ITIL processes that we support.

 

 

3.9 Capacity Management

The goal of Capacity Management is to provide adequate capacity to meet the SLAs in a cost-effective manner. In order to reach this goal:

·          Avoid overspending on capacity that goes unused.

·         Avoid guessing on what will meet the service needs.

·          Plan for upcoming needs, enabling better purchase decisions.

 

 

The Capacity Management process is made up of three sub-processes:

·         Business Capacity Management, taking the customers’ business plans into account to predict future capacity needs. After understanding the customer business needs, providers typically use models and simulations to determine the capacity required to support new or modified applications.

·         Service Capacity Management to ensure SLAs can be met based on current and peak loads on the infrastructure. This is done through monitoring and analyzing performance of the monitored services and comparing to the resource load determined in Resource Capacity Management.

·         Resource Capacity Management to understand the use of the IT infrastructure by monitoring trends on the resources needed to provide the services. This sub-process monitors metrics such as CPU utilization, disk space, bandwidth utilization, etc.

 

In both Service and Resource Capacity Management, trending is used to analyze the monitored data and predict future trends for capacity planning. Based on the trend information, these processes are also responsible for tuning the infrastructure to best meet the capacity needs; any required changes are made through the Change Management process.
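The trending idea described above can be illustrated with a minimal sketch. It assumes equally spaced utilization samples (for example, weekly disk usage in percent) and a simple straight-line trend; the function names and sample values are hypothetical and are not taken from any SPECTRUM product.

```python
def linear_trend(samples):
    """Return (slope, intercept) of the least-squares line through
    equally spaced samples, using x = 0, 1, 2, ... as the time axis."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def periods_until(samples, threshold):
    """Estimate how many sampling periods after the last sample the
    trend line reaches the threshold. Returns None when the trend is
    flat or falling, because the threshold is then never reached."""
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - (len(samples) - 1))


# Hypothetical weekly disk utilization, in percent.
weekly_disk_pct = [52.0, 54.0, 56.0, 58.0, 60.0]
weeks_left = periods_until(weekly_disk_pct, 80.0)
print(weeks_left)  # 10.0 weeks until the 80% threshold at the current trend
```

A straight-line fit is only one possible trending method; seasonal or bursty workloads would need a different model, and any resulting infrastructure change would still go through Change Management.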

We address the Service and Resource Capacity Management sub-processes. SPECTRUM, Service Manager, iAgent, and SPM are used to monitor and manage the infrastructure to ensure the agreed-upon service levels are met. Root Cause Analysis through advanced correlation allows the provider to focus on the real problems when they occur, while understanding service and customer impact.

Service Manager, Report Manager, and Report Gateway all play a role in analyzing data to predict future utilization and in highlighting capacity available for other use. First, Service Manager is used to indicate the impact of putting an infrastructure device in maintenance mode, showing which services and SLAs are impacted by a single device or group of devices (ITIL defines this as Component Failure Impact Analysis (CFIA)). SPECTRUM Report Gateway provides statistical trending reports on performance, allowing analysis of where capacity is available or where more is needed. This includes data from SPECTRUM, SPM, and iAgent for systems and applications. Finally, SPECTRUM Report Manager provides asset reporting, giving an understanding of the capacity of devices in the infrastructure, highlighting unused capacity, and showing when available capacity is getting low.
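As an illustration of the Component Failure Impact Analysis idea mentioned above, the following sketch maps services to the components they depend on and reports which services are affected when a device or group of devices is taken out of service. The service names, component names, and functions are hypothetical, and the sketch assumes a service is impacted whenever any one of its components is unavailable (no redundancy).

```python
# Hypothetical mapping of each service to the components it depends on.
SERVICE_COMPONENTS = {
    "email": {"core-router", "mail-server"},
    "web-store": {"core-router", "web-server", "db-server"},
    "payroll": {"db-server", "app-server"},
}


def impacted_services(down_components, service_components=SERVICE_COMPONENTS):
    """Return the sorted names of services that depend on at least one
    of the components being taken down."""
    down = set(down_components)
    return sorted(
        service
        for service, components in service_components.items()
        if components & down
    )


def impact_matrix(service_components=SERVICE_COMPONENTS):
    """Invert the mapping: for each component, list the services that
    would be impacted if that single component failed."""
    matrix = {}
    for service, components in service_components.items():
        for component in components:
            matrix.setdefault(component, []).append(service)
    return {component: sorted(services) for component, services in matrix.items()}


print(impacted_services(["db-server"]))      # ['payroll', 'web-store']
print(impact_matrix()["core-router"])        # ['email', 'web-store']
```

Components that appear against many services in the inverted matrix are the ones whose maintenance or failure deserves the most planning attention.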

With respect to tuning the infrastructure for optimization based on trend data, we enable capacity management on the SPECTRUM products. For example, OneClick allows monitoring of license use for all new products, showing current, average, and peak usage. Also, the Proof of Concept process allows customers to meet the Capacity Planning requirement of testing the system in their own lab environment before implementing it.

 

3.10 IT Service Continuity Management

The main goal of IT Service Continuity Management is to support Business Continuity Management by ensuring timely restoration of IT infrastructure and services following a disaster.

ITIL defines a disaster as: “An event that affects a service or system such that significant effort is required to restore the original performance level.” Examples of disasters include fires, floods, burglary, and large-scale power outages; by nature, a disaster is more serious than an incident.

Responsibilities of IT Service Continuity Management include:

·          Determining the scope of IT Service Continuity Management based on business requirements.

·          Performing Business Impact Analysis to determine the impact of a disaster and how long the business can survive without any particular IT Service.

·          Performing Risk Assessment to determine exposure based on assets, threats, and vulnerabilities.

·          Developing the IT Service Continuity Strategy based on a combination of preventative measures and recovery options.

·          Implementing the plan, including the preventative measures and recovery options, testing, training, review and audit, and assurance.

 

The SPECTRUM product suite helps with IT Service Continuity Management in multiple ways. SPECTRUM and Service Manager are capable of detecting IT disasters and providing both root cause analysis and the impact of the disaster. This allows the IT Service Continuity team to initiate the proper portions of the Disaster Recovery Plan in a timely manner.

Further, SPECTRUM products themselves are designed to work well within a high-availability IT environment, allowing timely recovery from disasters that affect the infrastructure management system. Fault Tolerant licenses are available for all products and are included in both Integrity and Infinity products, allowing infrastructure management to be assumed by the fault-tolerant server (which could be at a separate physical location) in the event a disaster impacts the primary server. Also, distributed configurations spread the workload so that any one system impacted by a disaster does not have as great an impact on management of the entire infrastructure.

 

 

3.11 Availability Management

The main goal of Availability Management is to cost-effectively provide the level of availability of IT Services required for successful business operation, typically defined in the SLA. Availability consists of service up-time, degraded time, time between failures, service recovery time, and other metrics related to a user’s ability to effectively use the service. For Availability Management to be effective, the business aspects of the service and its requirements must be well understood. Also, up-front planning must take place to ensure the service will be capable of providing high availability, and the service must be monitored to ensure availability agreements are met.

Availability Planning includes:

§          Determining availability requirements as the first step prior to defining SLAs. This must take the business requirements of the customer into account and be done for any new service as well as any service changes. It includes quantifiable requirements and impact.

§          Designing for availability and recoverability using the methods for risk assessment used in the IT Service Continuity Management process as well as Component Failure Impact Analysis used in the Capacity Management process. These are used to ensure the requirements can be met at an acceptable cost. If they cannot, it must be determined if the design can be modified to meet them or if the requirements need to be renegotiated with the customer. This responsibility also includes designing the monitoring and management system, as monitoring the components of a service and restoring the service are an important part of maintainability and recovery that must be considered in the design phase.

§          Security issues, such as access to information, need to be planned up front, as poor security planning can affect the availability of a service.

§          Maintenance Management is taken into consideration, as it is required for continued high availability of a service through upgrades and changes. Maintenance windows are typically scheduled at times when there will be the least impact on the service. These maintenance windows must be respected by the provider.

§          Developing the Availability Plan, which is a living document defining the current requirements and methods as well as improvement and maintenance guidelines for the future.

 

Monitoring includes measuring and reporting service availability. Reporting provides the baseline information required to validate the service agreements, as well as to solve problems and identify improvement opportunities. Availability monitoring for services includes uptime, degraded time, Mean Time Between Failure (MTBF), Mean Time To Repair (MTTR), and Mean Time Between System Incidents (MTBSI). MTBF and MTBSI are similar; the difference is that MTBF is the time between the end of one failure and the start of the next, whereas MTBSI is the time between the start of one failure and the start of the next. Dividing MTBF by MTBSI indicates whether there are a large number of minor failures or a small number of major failures.
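The metric definitions above can be made concrete with a short sketch. It assumes each failure is recorded as a (start, end) pair in hours, sorted by start time and non-overlapping; the function name and sample data are hypothetical.

```python
def availability_metrics(incidents):
    """Compute MTTR, MTBF, MTBSI, and the MTBF/MTBSI ratio from a list
    of (start, end) failure intervals sorted by start time.

    MTTR  : mean of (end - start) over all failures
    MTBF  : mean of (start of next failure - end of previous failure)
    MTBSI : mean of (start of next failure - start of previous failure)
    """
    if len(incidents) < 2:
        raise ValueError("at least two incidents are needed")
    pairs = list(zip(incidents, incidents[1:]))
    repair_times = [end - start for start, end in incidents]
    between_failures = [nxt[0] - prev[1] for prev, nxt in pairs]
    between_incidents = [nxt[0] - prev[0] for prev, nxt in pairs]
    mttr = sum(repair_times) / len(repair_times)
    mtbf = sum(between_failures) / len(between_failures)
    mtbsi = sum(between_incidents) / len(between_incidents)
    return {"MTTR": mttr, "MTBF": mtbf, "MTBSI": mtbsi, "ratio": mtbf / mtbsi}


# Hypothetical failures, in hours since the start of the reporting period.
failures = [(100.0, 102.0), (200.0, 201.0), (300.0, 303.0)]
print(availability_metrics(failures))
# MTTR = 2.0, MTBF = 98.5, MTBSI = 100.0, ratio = 0.985
```

A ratio close to 1 means that little of the interval between incident starts is spent in failure; a noticeably lower ratio means repairs take up a larger share of that interval.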

 

The SPECTRUM product suite contributes to the Availability Management process in many ways. First, Service Manager measures and reports on service availability for the Availability Management process. Service Manager also isolates and presents the root cause of service failures, significantly reducing unavailability by improving the ability to respond quickly to faults. Service Reports provide the data about faults from previous incident and problem records that Availability Management requires, allowing availability improvements. This data is provided as summary information as well as detailed information about every fault contributing to service availability problems. Also, to improve availability, service reports are generated showing every infrastructure item that contributes to a service outage, along with statistics on its contribution such as number of outages caused and number of services affected.

 

SPECTRUM's maintenance mode, combined with Service Manager's ability to incorporate planned maintenance windows within SLAs, supports Availability Management's requirement to schedule and respect planned downtime. Further, Service Manager shows which services and customers are impacted by maintaining a specific device or group of devices, so that maintenance windows can be planned for the least impact. In addition, with respect to availability of the Management System itself, SPECTRUM is easily deployed in a fault-tolerant mode for a high-availability management service with fast recovery times.
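To show how planned maintenance windows can be respected when measuring availability, here is a minimal sketch. It assumes that time inside a planned window is excluded from the agreed service time, that any outage time overlapping a window does not count as unplanned downtime, and that windows do not overlap one another. The function names and figures are hypothetical and do not describe how Service Manager computes its reports.

```python
def overlap(a, b):
    """Length of the overlap between two (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def availability_pct(period, outages, maintenance_windows):
    """Availability in percent over a reporting period of the given
    length, excluding planned maintenance from the agreed service time.

    outages and maintenance_windows are lists of (start, end) pairs in
    the same unit as period; windows are assumed not to overlap."""
    planned = sum(end - start for start, end in maintenance_windows)
    agreed_time = period - planned
    unplanned = 0
    for outage in outages:
        excused = sum(overlap(outage, window) for window in maintenance_windows)
        unplanned += (outage[1] - outage[0]) - excused
    return 100.0 * (agreed_time - unplanned) / agreed_time


# Hypothetical reporting period of 10,000 minutes with one 200-minute window.
windows = [(1000, 1200)]
outages = [(1100, 1250), (5000, 5048)]  # the first outage overruns the window
print(availability_pct(10000, outages, windows))  # 99.0
```

In this example only the 50 minutes by which the first outage overruns its window, plus the 48-minute second outage, count against the 9,800 minutes of agreed service time.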

 

3.12 Security Management

The goals of Security Management are to meet all external security requirements, including those in SLAs, contracts, and legislation, and to maintain base-level security in IT. The Security Management process includes:

·          Confidentiality to ensure only the authorized users can access the information

·          Integrity to ensure the information is correct

·          Availability to allow access to information when it is required, as defined in the SLA.

The Security Management process is responsible for setting, implementing, and maintaining the security policy. It is a continuous cycle of planning, implementing, and evaluating, then planning again based on the maintenance requirements gathered from the evaluation. The process is cyclical because the IT and security landscape is always changing.

Although SPECTRUM products do not directly support the Security Management process, these products fit well within a secure infrastructure. Security is built into the SPECTRUM OneClick UI Architecture both by requiring username and password authentication and by providing SSL-encrypted connections from the client to the web server. In addition, the web server is the only client to the SPECTRUM Assurance Server, allowing a trusted connection between the two and allowing them to be located inside the same trusted domain.

The OneClick Console with all integrated UI components, Service Dashboard, and Report Manager offer the ability to set up user privileges defining what specific users or groups of users can and cannot see or do. This supports confidentiality as well as integrity by removing the danger of unauthorized users seeing information they should not have access to or making changes that could affect the management of the infrastructure. For example, the administrator may have privileges to edit services and SLAs, schedule reports, and configure response time tests, while some users may have none of these privileges and others may have a subset of them. Further, the administrator can define information access permissions, including access to managed devices, specific reports, or even views within the user interface. Some users or customers may only have a view of the topology, while others may also be able to see alarms.
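The privilege model described above can be sketched generically as a mapping from roles to permitted actions. The role names, privilege names, and function below are hypothetical illustrations of the concept and do not reflect the actual SPECTRUM privilege configuration.

```python
# Hypothetical roles and the privileges granted to each.
ROLE_PRIVILEGES = {
    "administrator": {
        "edit_services",
        "edit_slas",
        "schedule_reports",
        "configure_response_time_tests",
        "view_topology",
        "view_alarms",
    },
    "operator": {"schedule_reports", "view_topology", "view_alarms"},
    "customer": {"view_topology"},
}


def is_allowed(role, privilege, role_privileges=ROLE_PRIVILEGES):
    """Return True only if the role is known and has been granted the
    privilege; unknown roles are denied everything by default."""
    return privilege in role_privileges.get(role, set())


print(is_allowed("administrator", "edit_slas"))  # True
print(is_allowed("operator", "edit_slas"))       # False
print(is_allowed("customer", "view_alarms"))     # False
```

Denying by default supports both confidentiality and integrity: a user sees or changes something only when a privilege has been granted explicitly.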

Finally, SPECTRUM takes advantage of SNMPv3 to provide secure management of any environment using SNMPv3 on the infrastructure devices.

 

4 Summary

ITIL is quickly becoming popular with IT organizations as they look to make IT Service Management more effective in today’s business world, which relies heavily on IT Services. Implementing ITIL helps ensure the appropriate service levels are delivered at an acceptable cost through proper business planning, relationship management, and IT management.

We are dedicated to supporting ITIL efforts in your organization. As outlined in this paper, the SPECTRUM product suite and business processes significantly contribute to your successful implementation of the ITIL best practices for IT Service Management.